Abstract

Shared registers are basic objects used as communication media in asynchronous
concurrent computation. A concurrent timestamp system is a higher-typed communication
object, and has been shown to be a powerful tool for solving many concurrency control
problems. It has turned out to be possible to construct such higher-typed objects from
primitive lower-typed ones. The next step is to find efficient constructions. We propose
a very efficient wait-free construction of bounded concurrent timestamp systems from
1-writer multireader registers. This finalizes, corrects, and extends a preliminary bounded
multiwriter construction proposed by the second author in 1986. That work partially
initiated the current interest in wait-free concurrent objects, and introduced a notion of
discrete vector clocks in distributed algorithms.



Categories and Subject Descriptors: B.3.2 [Memory Structures]: Design Styles -
shared memory; B.4.3 [Input/Output and Data Communications]: Interconnections
(subsystems) - asynchronous/synchronous operation; D.1.3 [Programming Techniques]:
Concurrent Programming; D.4.1 [Operating Systems]: Process Management -
concurrency, multiprocessing/multiprogramming; D.4.4 [Operating Systems]: Communications
Management - buffering.
1 Introduction



Consider a system of asynchronous processes that communicate among themselves only by
executing read and write operations on a set of shared variables (also known as shared
registers). The system has no global clock or any synchronization primitives. Every shared
variable is associated with a process (called its owner) which writes it, and the other
processes may read it. An execution of a write (read) operation on a shared variable will be
referred to as a Write (Read) on that variable. A Write on a shared variable puts a value from
a predetermined finite domain into the variable, and a Read reports a value from the domain.
A process that writes (reads) a variable is called a writer (reader) of the variable.

Wait-free shared variable: We want to construct shared variables in which the following
two properties hold. (1) Operation executions are not necessarily atomic, that is, they are
not indivisible, and (2) every operation finishes its execution within a bounded number of its
own steps, irrespective of the presence of other operation executions and their relative speeds.
That is, operation executions are wait-free. These two properties give rise to a classification of
shared variables, depending on their output characteristics. Lamport [29] distinguishes three
categories for 1-writer shared variables, using a precedence relation on operation executions
defined as follows: for operation executions A and B, A precedes B, denoted $A \rightarrow B$, if
A finishes before B starts; A and B overlap if neither A precedes B nor B precedes A. In
1-writer variables, all the Writes are totally ordered by "$\rightarrow$". The three categories of
1-writer shared variables defined by Lamport are the following.

1. A safe variable is one in which a Read not overlapping any Write returns the most 
recently written value. A Read that overlaps a Write may return any value from the 
domain of the variable. 

2. A regular variable is a safe variable in which a Read that overlaps one or more Writes 
returns either the value of the most recent Write preceding the Read or of one of the 
overlapping Writes. 
3. An atomic variable is a regular variable in which the Reads and Writes behave as if they
occur in some total order which is an extension of the precedence relation. 
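
The difference between the first two categories can be made concrete with a minimal sketch. It assumes a global time axis with integer timestamps; the names `Write` and `allowed_read_values` are illustrative and do not come from the paper. The function lists the values a Read may legitimately report under the safe and the regular semantics. Atomicity is not modelled, since it constrains whole executions rather than a single Read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    start: int   # time at which the Write starts
    end: int     # time at which the Write finishes
    value: int   # value being written


def allowed_read_values(kind, domain, initial, writes, read_start, read_end):
    """Values a Read over [read_start, read_end] may report on a 1-writer variable."""
    # Writes of a single writer are totally ordered, so sorting by start is safe.
    ordered = sorted(writes, key=lambda w: w.start)
    # A Write precedes the Read if it finishes before the Read starts.
    preceding = [w for w in ordered if w.end < read_start]
    # Two executions overlap if neither precedes the other.
    overlapping = [w for w in ordered
                   if not (w.end < read_start or read_end < w.start)]
    last_value = preceding[-1].value if preceding else initial
    if not overlapping:
        return {last_value}            # both categories agree here
    if kind == "safe":
        return set(domain)             # any value of the domain
    if kind == "regular":
        return {last_value} | {w.value for w in overlapping}
    raise ValueError("kind must be 'safe' or 'regular'")
```

With Writes of value 1 over [0, 2] and value 2 over [5, 9], a Read over [6, 7] may report anything in the domain on a safe variable, but only 1 or 2 on a regular one.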



A shared variable is boolean¹ or multivalued depending upon whether it can hold only two
or more than two values.

Multiwriter shared variable: A multiwriter shared variable is one that can be written and
read (concurrently) by many processes. Lamport [29] constructed a shared variable that could
be written by one process and read by one other process, but he did not consider constructions
of shared variables with more than one writer or reader. Vitanyi and Awerbuch [^] were the
first to construct an atomic multiwriter shared variable from 1-writer variables. They propose
two constructions: one from 1-writer multireader shared variables using bounded control
information, which turned out to be incorrect |^3[ (just regular and not atomic as claimed),
and the other from 1-writer 1-reader variables using unbounded control information. The
latter construction is correct. It is made bounded in [31], yielding one of the most nearly optimal
implementations currently known. (In this paper we correct and extend the first
construction to obtain an efficient version of the more general notion of bounded concurrent
timestamp system as defined below.) Related work is |]^, ^, ^ 17, 19, 23, 24, 2^, 30, 31, 34,
35, 36, 37, 40]. In particular, it is now possible to construct bounded multiwriter atomic
variables from 1-writer 1-reader safe bits. See [31], and the last section of this paper, for a
brief history of the subject.

Timestamp system: In a multiwriter shared variable it is only required that every process
keeps track of which process wrote last. There arises the general question whether every
process can keep track of the order of the last Writes by all processes. This idea was formalized
by Israeli and Li [21]. They introduced and analyzed the notion of timestamp system as an
abstraction of such a higher-typed communication medium. In a timestamp system every
process owns an object, an abstraction of a set of shared variables. One of the requirements
of the system is to determine the temporal order in which the objects are written. For this
purpose, each object is given a label (also referred to as a timestamp) which indicates the latest
(relative) time when it has been written by its owner process. The processes assign labels to
their respective objects in such a way that the labels reflect the real-time order in which they
are written to. These systems must support two operations, namely labeling and scan. A
labeling operation execution (Labeling, in short) assigns a new label to an object, and a scan
operation execution (Scan, in short) enables a process to determine the ordering in which all
the objects are written, that is, it returns a set of labeled objects ordered temporally. We are
concerned with those systems where operations can be executed concurrently, in an overlapped
fashion. Moreover, operation executions must be wait-free, that is, each operation execution
will take a bounded number of its own steps (the number of accesses to the shared space),
irrespective of the presence of other operation executions and their relative speeds.

¹ Boolean variables are referred to as bits.

Wait-free constructions of concurrent timestamp systems (CTSs, in short) have been shown
to be a powerful tool for solving concurrency control problems such as fcfs-mutual exclusion
m, multiwriter multireader shared variables |^], probabilistic consensus ||2|, ^, and fcfs
l-exclusion [12], by synthesizing a "wait-free clock" to sequence the actions in a concurrent
system.

Here, we are interested in constructing concurrent timestamp systems using 1-writer shared 
variables. It is not difficult to construct a timestamp system if the shared space is unbounded 
(there is no limit on the size of some shared variables). The problem gets much harder for 
bounded (shared space) systems. A bounded timestamp system is a timestamp system with 
a finite set of bounded size labels. In the rest of the paper, unless stated otherwise, by a 
timestamp system we mean a wait-free bounded concurrent timestamp system. 

Israeli and Li [21] constructed a bit-optimal bounded timestamp system for sequential
operation executions. The concurrent case of bounded timestamp system is harder, and the
first generally accepted solution is due to Dolev and Shavit 0. Their construction is of the
type as in [21] and uses shared variables of size O(n), where n is the number of processes
in the system. Each Labeling requires O(n) steps, and each Scan O(n^2 log n) steps. In their
construction, no Scan writes any shared variables: it is a 'pure' reading operation execution.
(But, by the theorem of Lamport [29, page 91], all such constructions become de facto
impure if we break them down to the lowest level of system building.) Following Dolev and
Shavit, several researchers have come up with other constructions. Israeli and Pinhasov [22]
use shared variables of size O(n^2); Labeling and Scan require O(n) steps. Gawlik, Lynch and
Shavit [14] use shared variables of size O(n^2); Labeling and Scan access O(n log n) shared
variables. In [10], Dwork and Waarts introduce a powerful communication abstraction called
"traceable use abstraction" to recycle values of shared variables. They demonstrate the
usefulness of the abstraction by constructing a CTS, borrowing the basic ideas and techniques
from [4^] for recycling private values. Their construction requires shared variables of size
O(n log n); Labeling and Scan require O(n) steps. Later, they along with Herlihy and Plotkin
pA[| propose a construction using shared variables of size O(n); Labeling and Scan access
O(n) shared variables. Unlike the Israeli-Li and Dolev-Shavit constructions, Scans in the other
proposed constructions are not pure; they write a lot of shared space.

Our result and related work: Among the constructions mentioned above, the one of
Dwork and Waarts [10] is relatively simple and efficient as well². They introduce the "traceable
use abstraction" to bound the size of labels. As in [^2], each label is a vector of n private
values, one for each of the n processes. Using a strategy similar to, and extending, [^], the
abstraction helps each process to keep track of its private values that are in use in the system.
At any point in time, a process can use only a bounded number of private values of another
process. Exploiting that feature, the abstraction helps in bounding the set of private values
needed. The labels are read by executing a traceable-read function, and written by executing
a traceable-write procedure. When the traceable-read function is executed to read a label, the
executing process explicitly informs all other processes which of their private values it is going
to use. A process can find which of its private values are in use by other processes even if the
values propagate through these processes in tandem one after another. To determine which of
its private values are currently not in use, a process executes a garbage collection routine. This
routine helps processes to safely recycle their respective private values that are not in use.
These three routines are at the heart of implementing the traceable use abstraction. Dwork
and Waarts [10] have shown how these routines are used in constructing a bounded concurrent
timestamp system. The most intricate among these routines is the garbage collection, whose
time complexity is O(n?), which could be, though nonstandard, uniformly amortized over O(n^)
labeling operation executions. To achieve this, each process needs to maintain a private,
separate pool of 22n^ private values. The costliest part of their construction is the use of
multireader 'order' variables. The construction uses, for each process, Q{n) sets of 22n-many
1-writer n-reader atomic variables of size O(n log n) bits each. Let us roughly estimate their
space complexity at the fundamental level, i.e., at the level of 1-writer 1-reader safe bits. (To
implement a 1-writer n-reader atomic variable of size m bits, the constructions in [^, |40t|
together require 3mn 1-writer 1-reader safe bits, 2n 1-writer 1-reader atomic bits and one
1-writer n-reader atomic bit. Each 1-writer 1-reader atomic bit can be implemented from O(1)
1-writer 1-reader safe bits [16, 29, 3^, 41]. A 1-writer n-reader atomic bit can be implemented
from O(n^) safe bits [16]. Thus, we require a total of 3mn + O(n^) 1-writer 1-reader safe bits
to implement a 1-writer n-reader atomic variable of size m bits.) Thus, there is a need of
at least O(n^ log n) bits at the fundamental level just for the order variables in each process.
Consequently, we need at least O(n^ log n) 1-writer 1-reader safe bits for all order variables of
all processes. In addition, there are other shared variables for the processes.

² We find it is the easiest one to understand; also see comments in US by Yakovlev.



The bounded multiwriter shared variable construction of Vitanyi and Awerbuch [4^], while
falling short of the claimed atomicity, has brought into prominence many techniques that
were used later in wait-free computing. An example is the idea of a label as a vector of n
individual clocks.³ (In that construction, vector entries are called 'tickets'.) Even better, it
turns out that the corrected version presented here suffices to implement the higher communication
object type of bounded CTS. The current paper is the final version of the pioneering preliminary
[4^] and its correction [^]. Dwork and Waarts [10], without stating this explicitly, used the idea of
(bounded) vector clocks and other techniques introduced in that construction, and hence their
solution bears a close resemblance to the construction proposed here (and, in fact, to other
constructions [35, 3^] based on [4^]). On the other hand, our construction uses some ideas from
their traceable use abstraction. We observe that in CTSs the propagation of private values is
restricted to only one level of indirection, and not to arbitrary levels. Consequently, the
propagation of private values can be tracked down by their respective owner processes with
relative ease. Moreover, the one-level indirect propagation of private values by other processes
need not be reported to the original owner of these private values. Thus, one does not need the
complete power of the traceable use abstraction for constructing a CTS. In our construction, we
use less powerful versions of traceable-read and traceable-write. However, we prefer to use the
same function/procedure names as [10], just to keep conformity with the literature. We do not
require a garbage collection routine, thereby simplifying the proposed CTS construction and its
correctness proof considerably.
When a process executes the traceable-read function, it does not explicitly inform the other
processes which of their private values it is going to use. On the other hand, the executors
of the traceable-write procedure correctly find which private values of which processes are
in use in the system. Another important point is that, in our construction, a Scan writes
a limited amount of information, only O(n) 1-writer 1-reader bits. Also, each local pool of
private values contains fewer than 2n^ values. We use a total of n^2 1-writer 1-reader
regular order variables of size O(n log n) bits each, requiring a total of O(n^ log n) safe
1-writer 1-reader bits at the fundamental level. Both the scan and labeling operation executions
require O(n) steps in terms of the shared variables used. But in our construction, a Scan reads
at most (n - 1) 1-writer 1-reader regular order variables, whereas in their construction it reads
(2n - 2) 1-writer n-reader atomic ones. Thus, at the fundamental level they scan an order of
magnitude more bits than we do.

³ The concept of vector clock is used in many areas of distributed computing, all in related contexts, to
keep track of execution evolution in distributed systems. (Cf. the articles by Mattern [B2, p3|.)

Our construction is not optimal in terms of the usage of shared space (Cf. Table 1 in
Section ^). It is perhaps possible to use a bounded set of global values and to recycle them instead
of using private values. Recycling of global values could lead to an optimal construction.

The remainder of this paper is organized as follows. Section 2 discusses the system model
and presents the problem statement precisely. A new construction of concurrent timestamp
systems is presented in Section 3, and its correctness proof in Section 4. Section 5 concludes
the paper.



2 Model, Problem Definition, and some Notations 

A concurrent bounded timestamp system (CTS, in short) is an abstract communication system
for n completely asynchronous processes $P_1, \ldots, P_n$. It consists of n objects O[1..n], each of
finite space representation, and supports two operations, namely labeling and scan(ning). A
labeling operation execution (Labeling, in short) of process $P_p$ assigns a new label to object
O[p]. It may use all existing labels of O[1..n], but it is not allowed to change the labels of
components other than O[p]. A scan operation execution (Scan, in short) enables a process to
determine the ordering in which all the objects are written, that is, it returns a set of labeled
objects ordered temporally⁴. It returns a pair $(\bar{\ell}, \prec)$, where $\bar{\ell}$ is a set of current labels, one for
each object-component, and $\prec$ is a total order on $\bar{\ell}$. Operation executions of each process are
sequential. However, operation executions of different processes need not be sequential, i.e.,
they might overlap.

We index the operation executions (Labelings and Scans) of a process $P_p$ by a superscript
$[k]$, $k \geq 1$: if the kth operation execution of $P_p$ is a Scan (Labeling), we denote it explicitly by
$S_p^{[k]}$ ($L_p^{[k]}$). The label written by a labeling operation execution $L_p^{[k]}$ is denoted by $\ell_p^{[k]}$.

⁴ We ignore, in this paper, the data values of the objects.

For operation executions A and B on a shared variable, $A \dashrightarrow B$ means that the execution
of A starts before that of B finishes. That is, if $A \dashrightarrow B$, then either $A \rightarrow B$ or A overlaps
B; in other words, $B \not\rightarrow A$. We also assume that if $B \not\rightarrow A$, then $A \dashrightarrow B$. That is, we
assume the global time model [29].
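
Under the global time model, these relations can be read off from start and finish times. The following is a minimal sketch under that assumption; the `Op` type and the function names are illustrative and not part of the paper, and all endpoints are assumed to be distinct time points.

```python
from typing import NamedTuple


class Op(NamedTuple):
    start: int   # global time at which the operation execution starts
    end: int     # global time at which it finishes


def precedes(a, b):
    """A -> B: A finishes before B starts."""
    return a.end < b.start


def overlaps(a, b):
    """A and B overlap: neither precedes the other."""
    return not precedes(a, b) and not precedes(b, a)


def starts_before_finish(a, b):
    """A --> B (dashed arrow): A starts before B finishes."""
    return a.start < b.end
```

With distinct endpoints, `starts_before_finish(a, b)` holds exactly when `precedes(b, a)` does not, which is the assumption stated above.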



A concurrent timestamp system must ensure the following properties [^, Q. 

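P1. Ordering: There exists an irreflexive total order $\Rightarrow$ on the set of all labeling operation
executions, such that the following two conditions hold.

• Precedence: For every pair of Labelings $L_p^{[k]}$ and $L_q^{[k']}$, if $L_p^{[k]} \rightarrow L_q^{[k']}$, then
$L_p^{[k]} \Rightarrow L_q^{[k']}$.

• Consistency: For every Scan $S_i^{[j]}$ returning $(\bar{\ell}, \prec)$, and for every two labels $\ell_p^{[k]}$ and $\ell_q^{[k']}$
in $\bar{\ell}$, $\ell_p^{[k]} \prec \ell_q^{[k']}$ iff $L_p^{[k]} \Rightarrow L_q^{[k']}$.

P2. Regularity: For every label $\ell_p^{[k]}$ in $\bar{\ell}$ returned by a Scan $S_i^{[j]}$, $L_p^{[k]}$ begins before
$S_i^{[j]}$ terminates, i.e., $L_p^{[k]} \dashrightarrow S_i^{[j]}$, and there is no Labeling $L_p^{[k']}$ such that
$L_p^{[k]} \rightarrow L_p^{[k']} \rightarrow S_i^{[j]}$.

P3. Monotonicity: Let $S_i^{[j]}$ and $S_{i'}^{[j']}$ be a pair of Scans returning sets $\bar{\ell}$ and $\bar{\ell}'$, respectively,
which contain labels $\ell_p^{[k]}$ and $\ell_p^{[k']}$, respectively. If $S_i^{[j]} \rightarrow S_{i'}^{[j']}$, then $k \leq k'$.

P4. Extended Regularity: Let $\ell_p^{[k]}$ be a label returned by a Scan $S_i^{[j]}$. For each Labeling $L_q^{[k']}$,
if $S_i^{[j]} \rightarrow L_q^{[k']}$, then $L_p^{[k]} \Rightarrow L_q^{[k']}$.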



The intuitive meaning of the above four properties is as follows. The ordering property
says that all the labeling operation executions can be totally ordered in a way that extends
their real-time precedence order "$\rightarrow$". Moreover, if two different Scans return the same two
labels, then both Scans will have the same order on these labels. The regularity property says that
labels returned by a Scan are not obsolete. The monotonicity property says that for every
two Scans ordered by "$\rightarrow$", it is not the case that the preceding Scan returns a new label
of a process $P_p$ and the succeeding Scan an old label of $P_p$. The monotonicity property does
not imply that labeling and scan operation executions of all processes are linearizable [20]. It
does imply the linearizability of the Scans of all processes and labeling operation executions of
a single process M. The extended regularity property says that if a Scan precedes a labeling
operation execution L, then all labels returned by the Scan were assigned by labeling operation
executions that precede L in $\Rightarrow$.

We are interested in those CTSs in which operation executions are wait-free, that is, each
operation execution will take a bounded number of its own steps (a step is a read/write of a
shared variable), irrespective of the presence of other operation executions and their relative
speeds. This paper is concerned with implementing wait-free CTSs from basic 1-writer
1-reader shared variables.

3 The Construction 

For the sake of convenience and better understanding, we first present an intuitive informal
description of a construction that uses unbounded shared space [4^] (the same idea is used in
[10]). Each process maintains a separate local pool of private values that are natural numbers
with the standard order relations on them.

A label is a vector of n values ('tickets' in [^]); its pth component holds a private value of
process $P_p$. The current label of O[p] is denoted by $\ell_p[1..n]$ or simply $\ell_p$. The current private
value of process $P_p$ is $\ell_p[p]$. Initially, $\ell_p[p] = 1$ and $\ell_p[q] = 0$, for all $q \neq p$. To determine a
new label for O[p], process $P_p$ reads all current private values of other processes $P_q$, namely
$\ell_q[q]$, and increments its own private value $\ell_p[p]$ by one to obtain the new private value. The
new label vector contains these n values, and it is written atomically in O[p]. Since the same
private value is not used twice in labeling operation executions, no two labels ever produced
in the system are the same. The ordering of two label vectors is done by using the standard
lexicographic (dictionary) order $\prec$: for every two labels $\ell_p \neq \ell_q$, the least significant index
in which they differ is the lowest k such that $\ell_p[k] \neq \ell_q[k]$; then, $\ell_p \prec \ell_q$ iff $\ell_p[k] < \ell_q[k]$.
This lexicographic order $\prec$ is a total order on the set of all possible labels [p!^ ], and this fact
is static common knowledge among the processes. (In fact, $\prec$ is an elementary example of a
well-ordered relation.) A Scan simply reads all the current labels and orders them using the
lexicographic order. This unbounded construction satisfies all the properties required for a
concurrent timestamp system (Cf. [10]).
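
The unbounded scheme just described can be sketched as a sequential simulation. This is only an illustration under stated assumptions: the class name `UnboundedCTS` and its methods are hypothetical, indices are 0-based, each method call is treated as one atomic step, and concurrency is not modelled at all.

```python
class UnboundedCTS:
    """Sequential sketch of the unbounded vector-label scheme (0-based indices)."""

    def __init__(self, n):
        self.n = n
        # labels[p][q] is component q of the current label of object O[p].
        # Initially the own component is 1 and all other components are 0.
        self.labels = [[1 if q == p else 0 for q in range(n)] for p in range(n)]

    def labeling(self, p):
        # Collect the current private value of every process (the diagonal) ...
        new = [self.labels[q][q] for q in range(self.n)]
        # ... and increment the own private value by one.
        new[p] = self.labels[p][p] + 1
        self.labels[p] = new   # assumed here to be written atomically
        return tuple(new)

    def scan(self):
        # Read all current labels and order the processes lexicographically,
        # comparing at the lowest index in which two labels differ.
        snapshot = [(tuple(label), p) for p, label in enumerate(self.labels)]
        return [p for _, p in sorted(snapshot)]
```

In a run where processes 2, 0 and 1 label one after another, `scan()` lists the processes in that same order, oldest label first.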

In the unbounded construction discussed above, every time a process $P_k$ executes a new
labeling operation, it uses a new private value greater than the previously used ones. In a
bounded construction, each process has only a bounded number of private values, and hence
it needs to use the same private value at different times, that is, it needs to recycle its own
private values. The following observation (which is a synthesis of the text in [42, page 236]) by
Dwork and Waarts suggests one way of doing the recycling. We quote them verbatim:



. . . for a system to be a concurrent timestamp system, every time a new private
value chosen by process $P_k$ need not be the one that was never used by $P_k$
beforehand; roughly speaking, instead of increasing its private value, it is enough for
$P_k$ to take as its new private value any value v of its private values that does not
appear in any labels, with one proviso: $P_k$ must inform the other processes that v
is to be considered larger than all its other private values currently in use.

Consequently, we cannot use the standard ordering relations on the natural numbers any
more, for the numbers may be recycled repeatedly. One now has to consider these numbers
as mere symbols with no standard ordering relations defined on them. We define, for every
two different private values $v$ and $v'$ of process $P_k$ currently in use in the system, $v \prec_k v'$ iff
$v$ is issued before $v'$ by $P_k$. Thus, in the bounded construction, the ordering relation among
the private values changes in time, and hence it cannot be a priori common knowledge. Note
that at any point in time, the relation $\prec_k$ on the values in use is a total order as the values are
produced in sequence, and in fact, it is well-ordered. For every two labels $\ell_p \neq \ell_q$ obtained
by a Scan, if k is the least significant index such that $\ell_p[k] \neq \ell_q[k]$, then we define $\ell_p \prec \ell_q$ iff
$\ell_p[k] \prec_k \ell_q[k]$. Then, $\prec$ is also a well-ordered relation. Now, we are concerned with two
things in a bounded construction. First, to make the relations $\prec_k$ useful, a process $P_k$ cannot
recycle a private value if some other process is using it. Second, for every two private
values $v$ and $v'$ of $P_k$ currently in use, if $v \prec_k v'$ then all other processes should (get to) know
this ordering before using these values. Note that the meaning of < on the natural numbers is
static common knowledge, but the meaning of $\prec_k$ changes continually. Thus, every time $P_k$
changes the ordering of two different private values, it should inform all the other processes
well in advance. Then, for all labels read by a Scan, the labels are ordered lexicographically,
based on the orderings $\prec_k$ of all processes $P_k$. Then, the correctness of the bounded system
trivially follows from that of the unbounded system mentioned above (given in p2| , p!c|]).
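
The label comparison based on the per-process orderings can be sketched as follows. This is a hypothetical illustration, not the paper's routine: `issue_order[k]` is assumed to be the list of the private values of process k currently in use, oldest first, and indices are 0-based.

```python
def label_precedes(lp, lq, issue_order):
    """Return True iff label lp is ordered before label lq.

    The labels are compared at the lowest index k in which they differ,
    using the issue order of process k rather than the numeric order.
    """
    for k, (a, b) in enumerate(zip(lp, lq)):
        if a != b:
            # Position in the issue list: a smaller rank means issued earlier.
            rank = {v: i for i, v in enumerate(issue_order[k])}
            return rank[a] < rank[b]
    return False   # identical labels are not ordered before each other
```

For example, if process 0 issued its values in the order 7, 2, 5, then a label with first component 7 is ordered before one with first component 2, even though 7 > 2 numerically: recycled values behave as mere symbols.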



In the following paragraphs, we present a novel construction, based on [42, 15], to achieve
the aforementioned two objectives. The construction is given in Figure 1.

We now introduce some terminology. The description of the construction has five parts:
shared variables declaration, TRACEABLE-WRITE procedure, TRACEABLE-READ
function, LABELING procedure and SCAN function. The procedures and the functions are
written in a Pascal-type language. To avoid too many 'begin's and 'end's, some blocks are shown
just by indentation. All the statements in the four routines are numbered only for reference
purposes.

A base shared variable x is read (respectively, written) by executing an instruction 'read
local-variable from x' (respectively, 'write local-variable in x'), where the local-variable is local
to the function or the procedure. The read-instruction assigns the value of x to the
local-variable, and the write-instruction writes the value of the local-variable in x. The writer
(owner) of a shared variable can retain the value of the variable in its local storage and refer
to it later on if needed, that is, it need not read the shared variable to determine the current
value of the variable. Nevertheless, for the sake of convenience and to avoid using many local
variables, we let the writer also read its own shared variable. The construction also uses some
private (local, non-shared) variables for each process. We assume that the private variables
are persistent.

Let us consider operation executions of a particular process $P_p$. Process $P_p$ executes the
LABELING procedure to obtain and assign a new label to O[p], and executes the SCAN
function to report the temporal ordering of the labels of O[1..n]. In a labeling operation
execution, it selects a presently unused private value from its local pool of values (Statements 1-2
in the LABELING procedure), collects the current private values of all other processes
(Statements 5-6), and then writes these n values atomically in O[p] as its new label (Statement 7).
The selection of a new private value is done in such a way that there is no trace of this value in
the system at present. In a scan operation execution, process $P_p$ first reads the current labels
of all the processes (Statement 1 in the SCAN function), and then determines their temporal
ordering using the latest ordering information available from some ordering shared variables
(Statement 2).

The collection of the current private values of other processes is done by executing the
TRACEABLE-READ function, and the writing of the new label is done by executing the
TRACEABLE-WRITE procedure.⁵ These two routines collectively implement atomic reading
and writing of labels from and into objects O[p]. (In the rest of the paper, an execution of the
TRACEABLE-READ function (TRACEABLE-WRITE procedure) will be called a traceable
Read (traceable Write).) Note that these two routines are not parts of the interface to the
CTS, and the processes cannot directly invoke them. They directly invoke the LABELING
and SCAN routines in which they, in turn, invoke traceable Read (Write) to read (write)
labels.

⁵ These two routines resemble pretty closely the READ and WRITE routines in [17].

A process $P_p$ uses shared variables w[p, 1..n], r[p, 1..n], c[p], label[p, 0..1] and copylabel[p, 1..n]
to atomically read and write new labels from and into object O[p]. The label and copylabel
variables are used to hold labels of O[p]. The variables w and r are handshake variables used to
detect overlapping of traceable Reads and Writes. The variable c is used to atomically declare
the writing of new labels in O[p]. Process $P_p$ uses the shared variables order[p, 1..n] to inform all
the processes of the latest ordering relation $\prec_p$. The shared variables lend[p, 1..n] are used to
inform all the processes which of their private values might be in use in the system. The
component lend[p, j] contains all the private values of process $P_j$ that $P_p$ may have lent to other
processes. Process $P_p$ also uses static private variables: d_p, myLend_p, $\prec_p$, and old-label_p. The
variables d_p and myLend_p always store the values of c[p] and lend[p, 1..n], respectively, locally.
$\prec_p$ contains the latest ordering information of all the private values in use in the system, and
old-label_p stores the label of the ongoing or the most recently completed Labeling operation
execution.

The traceable Writes of process $P_p$ use two n-reader safe main label variables, label[p, 0]
and label[p, 1], and a 1-reader safe copy label variable for each process, copylabel[p, 1..n].
The main label variables are used alternately for writing successive new labels. Immediately
after writing a new label in a main label variable, the process records that variable index
in the 1-writer multireader boolean atomic variable c[p]. (This writing atomically 'declares'
the current label of component O[p].) Then the process checks for each i whether a new
traceable Read of process $P_i$ started since the last traceable Write (of $P_p$). This is done by
using a pair of boolean 1-writer 1-reader (handshaking) atomic variables r[i, p] and w[p, i].⁶
Process $P_i$ sets these values different, by assigning the complement of w[p, i] to r[i, p] at
the beginning of each traceable Read (Statements 1-2 in TRACEABLE-READ), and process
$P_p$ makes sure that they are the same at the end of each traceable Write (Statements 4.1
and 4.2.3 in TRACEABLE-WRITE). In this way the processes $P_p$ and $P_i$ can find out whether
their traceable Writes and Reads overlap. Hence if the two values are different
when the process $P_p$ checks them, a new traceable Read of $P_i$ must have started by then. In
that case, $P_p$ also writes the new label value in copylabel[p, i], and then sets the above values
the same, by assigning the r[i, p] value to w[p, i]. (This way it is guaranteed that a reading
and a writing on copylabel variables do not overlap each other, and that the variable contains
a valid value for the traceable Read [4^, 17, 4^].) For each such process $P_i$, $P_p$ takes a note of
which of the private values of processes $P_j$ could be used by $P_i$ (Statement 4.2.2). Finally, $P_p$
informs all the processes $P_j$ which of their private values could be in use (all that $P_p$ knows of)
through 1-writer 1-reader regular variables lend[p, j] (Statement 6).

⁶ This strategy of detecting overlapping operation executions was pioneered by Peterson [34].

Each traceable Read of process Pp from a process Pi, after reading w[i,p] and writing
its complement in r[p,i] as mentioned above (Statements 1-2 in TRACEABLE-READ), finds
out from c[i] the main label variable that has been written by Pi most recently, and reads
from that variable. Then it reads w[i,p] again and compares it with r[p,i]. If the two values
continue to be different, then the reading of the main label variable does not overlap any
writing of that label variable, and hence it returns the value just read from the main label
variable. Otherwise, there is a possibility that the reading of the label variable overlaps
some writing of the same variable, and hence it reads copylabel[i,p] and returns that value.
Note that in the latter case, a traceable Write by Pi must have finished (with respect to Pp,
that is, Pi must have done loop iteration p at Statement 4 in TRACEABLE-WRITE) after
the traceable Read started, and that Write would have written in copylabel[i,p].
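
The handshake described above can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the paper's TRACEABLE-READ and TRACEABLE-WRITE routines: it models one writer and one reader, omits the lend and order bookkeeping, and treats every shared access as instantaneous. The names `Channel`, `traceable_write`, `traceable_read_steps` and `run_read` are hypothetical.

```python
class Channel:
    """Shared variables between one writer and one reader (hypothetical model)."""

    def __init__(self, initial_label):
        self.label = [initial_label, initial_label]  # the two main label variables
        self.c = 0                      # index of the most recently written main label
        self.copylabel = initial_label  # copy used when a Read may overlap a Write
        self.r = 0                      # handshake bit owned by the reader
        self.w = 0                      # handshake bit owned by the writer


def traceable_write(ch, new_label):
    """Writer side: alternate main labels, then answer a pending handshake."""
    x = 1 - ch.c
    ch.label[x] = new_label     # write the main label variable not in use
    ch.c = x                    # 'declare' it as the current label
    if ch.r != ch.w:            # a new traceable Read started since the last Write
        ch.copylabel = new_label
        ch.w = ch.r             # make the handshake bits equal again


def traceable_read_steps(ch):
    """Reader side, as a generator that yields between shared accesses."""
    ch.r = 1 - ch.w             # make the handshake bits different
    yield
    value = ch.label[ch.c]      # read the most recently declared main label
    yield
    if ch.w != ch.r:            # bits still differ: no Write completed its handshake
        return value
    return ch.copylabel         # otherwise fall back to the copy


def run_read(ch, after_handshake=(), after_label_read=()):
    """Run one Read, interleaving whole Writes at two chosen points."""
    gen = traceable_read_steps(ch)
    next(gen)
    for lab in after_handshake:
        traceable_write(ch, lab)
    next(gen)
    for lab in after_label_read:
        traceable_write(ch, lab)
    try:
        next(gen)
    except StopIteration as stop:
        return stop.value
```

In this model a Read that runs without interference returns the main label, while a Read overlapped by one or more Writes returns the copy written by the first Write that noticed the handshake.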

In selecting a new (currently unused) private value, process Pp does not use any of the
values stored in lend[1..n,p] (Statements 1-2 in LABELING). After selecting the new private
value, say v, Pp informs all processes Pi that v is the most recent private value through 1-writer
1-reader regular variables order[p,i] (Statements 3-4), which are used by the Scans of Pi.
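
As an illustration of the selection step, the following sketch picks a private value that no lend entry currently references. It is a minimal model under assumptions: each lend[q,p] is flattened into a plain Python list, unused slots are `None`, and the function name `select_private_value` is hypothetical rather than taken from the LABELING procedure.

```python
def select_private_value(pool, lend_column):
    """Return a value from `pool` that is not referenced in any lend[q][p] entry.

    `lend_column` holds one flattened list per process q (a hypothetical layout).
    """
    in_use = set()
    for entry in lend_column:
        in_use.update(v for v in entry if v is not None)
    for candidate in pool:
        if candidate not in in_use:
            return candidate
    raise RuntimeError("pool too small: every private value is still reserved")
```

The pool has to be larger than the number of values that can be reserved at once, which is why the boundedness argument counts the lend entries.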

4 Correctness Proof 

Proposition 1 [29] For operation executions B and C on a shared variable, and all operation
executions A and D, if $A \to B \dashrightarrow C \to D$, then $A \to D$.

Proof: The implication follows by the transitivity of (i) A finishes before B starts, (ii) B starts
before C finishes and (iii) C finishes before D starts. □

Definition. For operation executions A and B executed on the same atomic variable x, we
say $A \Rightarrow_x B$ if A precedes B in the total ordering imposed on the operation executions by
the atomic variable. The subscript x is omitted when it is clear from the context. □

Proposition 2 For operation executions B and C on an atomic variable x, and all operation
executions A and D, if $A \to B \Rightarrow_x C \to D$, then $A \to D$.

Proof: The relation $B \Rightarrow_x C$ implies that B precedes or overlaps C (since the total order imposed
on the operation executions by the atomic variable is an extension of the precedence relation),
that is, $B \dashrightarrow C$. Then the implication follows by Proposition 1. □
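
Proposition 1 can be checked mechanically in a simple global-time model. The sketch below assumes that each operation execution is an interval with integer start and finish times; "precedes" means finishing before the other starts, and "precedes or overlaps" means starting before the other finishes. The names `Op`, `precedes`, `can_affect` and `proposition1_holds` are hypothetical.

```python
from collections import namedtuple
from itertools import product

Op = namedtuple("Op", "start finish")


def precedes(a, b):
    """a finishes before b starts."""
    return a.finish < b.start


def can_affect(a, b):
    """a starts before b finishes (a precedes or overlaps b)."""
    return a.start < b.finish


def all_ops(horizon):
    """Every interval with integer endpoints in [0, horizon]."""
    return [Op(s, f) for s in range(horizon) for f in range(s + 1, horizon + 1)]


def proposition1_holds(horizon):
    """Exhaustively test: A precedes B, B can affect C, C precedes D => A precedes D."""
    ops = all_ops(horizon)
    for a, b, c, d in product(ops, repeat=4):
        if precedes(a, b) and can_affect(b, c) and precedes(c, d):
            if not precedes(a, d):
                return False
    return True
```

The exhaustive check is only a sanity test on a small time horizon; the proof in the text covers the general case.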

The following notations are used in the presentation of the correctness proofs. 

N1. The kth operation execution of a process Pp is denoted, as stated in Section ^, by
$O_p^{[k]}(O)$, $k \ge 1$; if it is a Scan (alternatively, a Labeling), we denote it explicitly by
$S_p^{[k]}(O)$ (alternatively, $L_p^{[k]}(O)$). The '(O)' part in the notation is omitted when it is
clear from the context. All the operation executions of Pp are totally ordered. That is,
for $k \ge 2$, $O_p^{[k-1]} \to O_p^{[k]}$.

N2. For a shared variable x, the Read (respectively, Write) of x by $O_p^{[k]}$ is denoted by $R_p^{[k]}(x)$
(respectively, $W_p^{[k]}(x)$). If x is referred to more than once, then the superscript [k,j] is
used for the jth access.

N3. Each operation execution $O_p^{[k]}$ ($L_p^{[k]}$ or $S_p^{[k]}$) of process Pp executes the TRACEABLE-READ
function for every other process Pi; the whole function execution is denoted by
a traceable Read $TR_{p,i}^{[k]}$.

N4. Each labeling operation execution $L_p^{[k]}$ of process Pp executes the TRACEABLE-WRITE
procedure; the whole procedure execution is denoted by a traceable Write $TW_p^{[k]}$.

N5. For the sake of convenience, the variables r[p,i] and w[p,i] are abbreviated to $r_{p,i}$ and
$w_{p,i}$, respectively.

Definition. For a shared variable x, we define a reading mapping $\pi_x$ for Reads of x as follows:
if a Read R returns the value written by a Write W, then $\pi_x(R)$ is W; otherwise $\pi_x(R)$ is
undefined. (Note, for safe x, $\pi_x$ is a partial mapping.) We omit the subscript x when it is
clear from the context. □

Lemma 1 (a) No two consecutive labeling operation executions of a process have the same 
private value. 

(b) No two consecutive traceable Writes of a process have the same private value.

Proof: Part (a) follows from the select statement (Statement 2) in the LABELING procedure.
Part (b) follows from Part (a), as each Labeling executes one and only one traceable
Write. □

Lemma 2 Each time, the value written in $w_{p,i}$ is the complement of the previous value of $w_{p,i}$.

Proof: Immediate from Statements 4.1, 4.2 and 4.2.3 in the TRACEABLE-WRITE procedure. □

Lemma 3 Any traceable Write $TW_p^{[k]}$ (actually, $L_p^{[k]}$) that writes $w_{p,i}$ sets $w_{p,i} = r_{i,p}$, and
if $R_i^{[l,1]}(w_{p,i}) \Rightarrow W_p^{[k]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i})$ for some traceable Read $TR_{i,p}^{[l]}$ (actually, $O_i^{[l]}$) of
process Pi, then the equality continues to hold until the execution of $TR_{i,p}^{[l]}$ is complete, in fact
until the next traceable Read $TR_{i,p}^{[l+1]}$ writes $r_{i,p}$.

Proof: Initially, $w_{p,i} = r_{i,p}$, since both of them are initialized to 0. Among the traceable
Writes of the process Pp, some will write $w_{p,i}$, and some will not. Let $TW_p^{[k_j]}$, $j \ge 1$, $k_j \ge 1$,
be the jth traceable Write that writes $w_{p,i}$.

Consider $TW_p^{[k_1]}$. By Lemma 2, it writes 1 in $w_{p,i}$. This implies, by Statements 4.1 and
4.2.3 in TRACEABLE-WRITE, that it read 1 from $r_{i,p}$. Since the initial value of $r_{i,p}$ is 0,
some traceable Read of Pi must have written 1 in $r_{i,p}$. Let $TR_{i,p}^{[l]}$ be the first such traceable
Read. Then $W_i^{[l]}(r_{i,p}) \Rightarrow R_p^{[k_1]}(r_{i,p})$. Note that $TR_{i,p}^{[l]}$ reads 0 from $w_{p,i}$ and hence writes 1
in $r_{i,p}$ (Statements 1-2 in TRACEABLE-READ). Also each subsequent traceable Read $TR_{i,p}^{[l']}$,
if any, such that $R_i^{[l',1]}(w_{p,i}) \Rightarrow W_p^{[k_1]}(w_{p,i})$, would read 0 from $w_{p,i}$, and hence will write 1
in $r_{i,p}$. Hence, irrespective of whether $W_i^{[l']}(r_{i,p}) \Rightarrow R_p^{[k_1]}(r_{i,p})$ or $R_p^{[k_1]}(r_{i,p}) \Rightarrow W_i^{[l']}(r_{i,p})$,
on $W_p^{[k_1]}(w_{p,i})$, $w_{p,i} = r_{i,p}$, and if $R_i^{[l',1]}(w_{p,i}) \Rightarrow W_p^{[k_1]}(w_{p,i}) \Rightarrow R_i^{[l',2]}(w_{p,i})$ for some traceable
Read $TR_{i,p}^{[l']}$, then the equality continues to hold until $TR_{i,p}^{[l']}$ is complete, in fact until the
next traceable Read $TR_{i,p}^{[l'+1]}$ writes $r_{i,p}$, since $w_{p,i}$ will not be changed by any traceable Write
$TW_p^{[k']}$, for $k' > k_1$, that may occur before $TR_{i,p}^{[l']}$ is complete.

Assuming as induction hypothesis that the assertion holds for $TW_p^{[k_j]}$, for some j, we
show that the assertion holds for $TW_p^{[k_{j+1}]}$. By the statement of the lemma, $TW_p^{[k_j]}$ sets
$w_{p,i} = r_{i,p}$ by writing a value, say $b \in \{0, 1\}$, in $w_{p,i}$. Then, by Lemma 2, $TW_p^{[k_{j+1}]}$ writes $\neg b$
in $w_{p,i}$.^ This implies, by Statements 4.1 and 4.2.3 in TRACEABLE-WRITE, that it read $\neg b$ from
$r_{i,p}$. As the value of $r_{i,p}$ is b when $TW_p^{[k_j]}$ reads it, there must be a traceable Read that writes
$\neg b$ in $r_{i,p}$ after $TW_p^{[k_j]}$ sets $w_{p,i} = r_{i,p}$. Let $TR_{i,p}^{[l]}$ be the first such traceable Read. Then,
$W_i^{[l]}(r_{i,p}) \Rightarrow R_p^{[k_{j+1}]}(r_{i,p})$, and $TR_{i,p}^{[l]}$ writes $\neg b$ in $r_{i,p}$. Each subsequent traceable Read $TR_{i,p}^{[l']}$,
if any, such that $R_i^{[l',1]}(w_{p,i}) \Rightarrow W_p^{[k_{j+1}]}(w_{p,i})$, would read b from $w_{p,i}$, and hence will write $\neg b$
in $r_{i,p}$. Hence, irrespective of whether $W_i^{[l']}(r_{i,p}) \Rightarrow R_p^{[k_{j+1}]}(r_{i,p})$ or $R_p^{[k_{j+1}]}(r_{i,p}) \Rightarrow W_i^{[l']}(r_{i,p})$,
on $W_p^{[k_{j+1}]}(w_{p,i})$, $w_{p,i} = r_{i,p}$. If $R_i^{[l',1]}(w_{p,i}) \Rightarrow W_p^{[k_{j+1}]}(w_{p,i}) \Rightarrow R_i^{[l',2]}(w_{p,i})$ for some traceable
Read $TR_{i,p}^{[l']}$, then the equality continues to hold until $TR_{i,p}^{[l']}$ is complete, in fact until the
next traceable Read $TR_{i,p}^{[l'+1]}$ writes $r_{i,p}$, since $w_{p,i}$ will not be changed by any traceable Write
$TW_p^{[k']}$, for $k' > k_{j+1}$, that may occur before $TR_{i,p}^{[l']}$ is complete. □

^$\neg b$ is defined as $1 - b$.

Lemma 3 implies the following property.

Lemma 4 Let $TR_{i,p}^{[l]}$ be a traceable Read. There can be at most one traceable Write, say
$TW_p^{[k]}$, such that $R_i^{[l,1]}(w_{p,i}) \Rightarrow W_p^{[k]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i})$. The traceable Read $TR_{i,p}^{[l]}$ on
$R_i^{[l,2]}(w_{p,i})$ will find $r_{i,p} = w_{p,i}$ if there is such a traceable Write, and $r_{i,p} \ne w_{p,i}$ otherwise. □

In the following we use a typical kind of notation for labeling operation executions.

N6. The labeling operation executions of process Pp are sometimes denoted by $L_p^{[k_j]}$, where
k is some letter and j is a natural number, $j \ge 1$, $k_j \ge 1$. Thus, for $j > 1$, $L_p^{[k_{j-1}]}$ and
$L_p^{[k_j]}$ are two consecutive labeling operation executions of Pp such that $L_p^{[k_{j-1}]} \to L_p^{[k_j]}$.
They need not be two consecutive operation executions, that is, $k_j \ge k_{j-1} + 1$.

In the following two lemmas, we show that traceable Reads return valid label values. We
also define their reading mapping function $\pi$. Lemmas 5 and 6 deal with the cases in which
traceable Reads return values from label and copylabel variables, respectively.

Lemma 5 Let $TR_{i,p}^{[l]}$ be a traceable Read that finds $r_{i,p} \ne w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$. Suppose
$\pi(R_i^{[l]}(c[p]))$ is $W_p^{[k_j]}(c[p])$ (of the traceable Write $TW_p^{[k_j]}$ of $L_p^{[k_j]}$), and label[p,x] is the main
label variable from which $TR_{i,p}^{[l]}$ returns the label value.

(a) If j' is the least index such that $R_i^{[l,2]}(w_{p,i}) \Rightarrow W_p^{[k_{j'}]}(w_{p,i})$, then j' equals j or j + 1.

(b) $\pi(TR_{i,p}^{[l]})$ is $TW_p^{[k_j]}$.

(c) The traceable Read $TR_{i,p}^{[l]}$ reading label[p,x] does not conflict with any traceable Write
writing that label variable.

Proof:

(a) Let j'' be the greatest index such that j'' < j' and $TW_p^{[k_{j''}]}$ writes $w_{p,i}$. Then by (i) the
choice of j', (ii) the assumption that $TR_{i,p}^{[l]}$ finds $r_{i,p} \ne w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$ and (iii) Lemma 4, it
follows that $W_p^{[k_{j''}]}(w_{p,i}) \Rightarrow R_i^{[l,1]}(w_{p,i})$. That is, $W_p^{[k_{j''}]}(w_{p,i}) \Rightarrow R_i^{[l,1]}(w_{p,i}) \to R_i^{[l,2]}(w_{p,i}) \Rightarrow
W_p^{[k_{j'}]}(w_{p,i})$. The traceable Write $TW_p^{[k_{j''}]}$ sets $w_{p,i}$ equal to $r_{i,p}$, $TR_{i,p}^{[l]}$ sets $r_{i,p}$ not equal to
$w_{p,i}$, and hence $TW_p^{[k_{j'}]}$ is the first traceable Write, after $TW_p^{[k_{j''}]}$, that finds $r_{i,p} \ne w_{p,i}$.

From $W_i^{[l]}(r_{i,p}) \to R_i^{[l]}(c[p]) \Rightarrow W_p^{[k_{j+1}]}(c[p]) \to R_p^{[k_{j+1}]}(r_{i,p})$, we have $W_i^{[l]}(r_{i,p}) \to
R_p^{[k_{j+1}]}(r_{i,p})$. That is, the traceable Write $TW_p^{[k_{j+1}]}$ will find $r_{i,p} \ne w_{p,i}$, the inequality set by
$TR_{i,p}^{[l]}$, unless an earlier traceable Write has found the inequality and set $w_{p,i}$ equal to $r_{i,p}$. We
claim that such an earlier traceable Write, if one exists, can only be $TW_p^{[k_j]}$. Suppose, on the
contrary, that it is $TW_p^{[k_{j'''}]}$ for j''' < j. Then, by the choice of j'' and Lemma ^ we have
$W_p^{[k_{j''}]}(w_{p,i}) \Rightarrow R_i^{[l,1]}(w_{p,i}) \to R_i^{[l]}(c[p]) \to R_i^{[l,2]}(w_{p,i}) \Rightarrow W_p^{[k_{j'''}]}(w_{p,i}) \to W_p^{[k_j]}(c[p])$.
This implies $R_i^{[l]}(c[p]) \to W_p^{[k_j]}(c[p])$, contradicting the assumption that $\pi(R_i^{[l]}(c[p]))$ is
$W_p^{[k_j]}(c[p])$. The assertion follows.

(b and c) Let label[p,x'] be the variable in which $TW_p^{[k_j]}$ writes.

For j' described in part (a), we have $R_i^{[l]}(label[p,x]) \to R_i^{[l,2]}(w_{p,i}) \Rightarrow W_p^{[k_{j'}]}(w_{p,i}) \to
TW_p^{[k_{j'+1}]}$. That is, $TR_{i,p}^{[l]}$ finishes reading label[p,x] before the traceable Write $TW_p^{[k_{j'+1}]}$ starts
its execution. From (i) the assumption that $\pi(R_i^{[l]}(c[p]))$ is $W_p^{[k_j]}(c[p])$, (ii) the property that
$TW_p^{[k_{j+1}]}$ does not write in the same main label variable that $TW_p^{[k_j]}$ writes,
(iii) $W_p^{[k_j]}(label[p,x']) \to W_p^{[k_j]}(c[p]) \Rightarrow R_i^{[l]}(c[p]) \to R_i^{[l]}(label[p,x])$, and (iv) Statements
1-3 in TRACEABLE-WRITE, it follows that x = x', and $TW_p^{[k_j]}$ finishes writing
label[p,x] before $TR_{i,p}^{[l]}$ starts reading it. The assertions follow. □

Lemma 6 Let $TR_{i,p}^{[l]}$ be a traceable Read that finds $r_{i,p} = w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$. Suppose $TW_p^{[k_j]}$
is the traceable Write such that $R_i^{[l,1]}(w_{p,i}) \Rightarrow W_p^{[k_j]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i})$.

(a) The traceable Read $TR_{i,p}^{[l]}$ reading copylabel[p,i] does not conflict with any traceable
Write writing it.

(b) $\pi(TR_{i,p}^{[l]}) = TW_p^{[k_j]}$.

Proof: (a and b) By Lemma 4, $TW_p^{[k_j]}$ is the only traceable Write such that $R_i^{[l,1]}(w_{p,i}) \Rightarrow
W_p^{[k_j]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i})$. It is clear from the TRACEABLE-WRITE procedure that $TW_p^{[k_j]}$
writes the value in copylabel[p,i] (Statement 4.2.1) before setting the $w_{p,i}$ and $r_{i,p}$ values equal
(Statement 4.2.3). This equality will not be changed until Pi starts the next traceable Read.
Thus, the traceable Write $TW_p^{[k_{j+1}]}$ and subsequent traceable Writes of Pp, if they find $r_{i,p} =
w_{p,i}$, will not write the copylabel variable. From $W_p^{[k_j]}(copylabel[p,i]) \to W_p^{[k_j]}(w_{p,i}) \Rightarrow
R_i^{[l,2]}(w_{p,i}) \to R_i^{[l]}(copylabel[p,i])$, we have $W_p^{[k_j]}(copylabel[p,i]) \to R_i^{[l]}(copylabel[p,i])$. The
assertions follow. □

Now we show that the private values of the processes Pp are traceable. If a process Pi
uses a private value v of another process Pp in its current label, Pi announces this use of v by
setting lend[i,p][1][i] to v at the end of the corresponding traceable Write (Statements 5-6).
Thus, all the private values in the existing labels are traceable by their respective owners.
The following lemma shows that the private values used by Scans are also traceable.

Lemma 7 Let a Scan $S_i^{[l]}$ of a process Pi use a private value v of a process Pp that has written
the value v in a traceable Write $TW_p^{[k_j]}$. Then, Pp does not recycle v until $S_i^{[l]}$ is complete.

Proof: We need to consider the following two cases.

Case 1: $S_i^{[l]}$ got v directly from Pp.

We need to consider two subcases.

Subcase a. If the traceable Read $TR_{i,p}^{[l]}$ returns the value v from copylabel[p,i], then, by
Lemma ^ and ^, the traceable Write $TW_p^{[k_j]}$ has executed the if-statement body (Statement 4.2)
for process Pi. There it has set myLendp[p][1][i] to v (Statement 4.2.2). The
successive traceable Writes of Pp that occur before $S_i^{[l]}$ is complete will not execute the
if-statement, and hence will not change the myLendp[p][1][i] value. (Statement 5 does not
change the value either.) As the labeling operation executions of Pp do not reuse the values
referred to in lend[1..n,p], v will not be reissued at least until $S_i^{[l]}$ is complete (Statements 1-2
in LABELING).

Subcase b. If the traceable Read $TR_{i,p}^{[l]}$ returns the value v from a main label variable, then by
Lemma 5(a), traceable Write $TW_p^{[k_j]}$ or $TW_p^{[k_{j+1}]}$ executes the if-statement for process Pi. In
the case of $TW_p^{[k_j]}$, myLendp[p][1][i] is set to v, and in the case of $TW_p^{[k_{j+1}]}$, myLendp[p][0][i] is
set to v (Statements 4.2.2 and 7). The successive traceable Writes of Pp that occur before $S_i^{[l]}$ is
complete will not execute the if-statement, and hence will not change the myLendp[p][0..1][i]
values. (Statement 5 does not change the values either.) By Lemma 1, $TW_p^{[k_{j+1}]}$ uses a private
value different from v. So, by the argument given in Subcase a, v will not be reissued as
a new private value until $S_i^{[l]}$ is complete.

Case 2: $S_i^{[l]}$ got v from another process Pq.

Claim. Process Pq has obtained v directly from Pp.

Proof: Note that $S_i^{[l]}$ got v by reading a label from Pq. That is, Pq writes v in the pth component
of the label. To form a new label, Pq uses the jth component of the labels it reads from
processes Pj (Statements 5-6 in LABELING). Hence, Pq obtains v directly from Pp. □

Let $L_q^{[m_o]}$ be the corresponding labeling operation execution. Note that each labeling operation
execution also executes traceable Reads (Statement 5). Then $\pi(TR_{q,p}^{[m_o]})$ is $TW_p^{[k_j]}$ and
$\pi(TR_{i,q}^{[l]})$ is $TW_q^{[m_o]}$. As argued in Case 1, either $TW_p^{[k_j]}$ or $TW_p^{[k_{j+1}]}$ stores v in myLendp[p][0..1][q].
This value will not be changed until $L_q^{[m_o]}$ is complete, in fact until Pq starts its next operation
execution $O_q^{[m_o+1]}$. Let $TW_p^{[k_{j'}]}$, $j' \ge j + 1$, be the first traceable Write that changes the
myLendp[p][0..1][q] values to values different from v. Then, it must have found that $L_q^{[m_o]}$ is complete and
that the next operation execution of Pq, namely $O_q^{[m_o+1]}$, has started. From $W_q^{[m_o]}(lend[q,p]) \to
O_q^{[m_o+1]}(O) \dashrightarrow L_p^{[k_{j'}]}(O) \to L_p^{[k_{j'+1}]}$, we have $W_q^{[m_o]}(lend[q,p]) \to L_p^{[k_{j'+1}]}$. That is, $L_p^{[k_{j'+1}]}$
and successive labeling operation executions of Pp would not reissue v if it is found in lend[q,p]
(Statements 1-2). Note that $TW_q^{[m_o]}$ will write v in lend[q,p][1][q] at the end of its execution
(Statements 5-6 in TRACEABLE-WRITE). Also note that the traceable Write $TW_p^{[k_{j'}]}$
(actually $L_p^{[k_{j'}]}$) does not issue v. Now, from $\pi(TR_{i,q}^{[l]})$ is $TW_q^{[m_o]}$ it follows, by Lemmas 5
and 6, that either $TW_q^{[m_o]}$ or $TW_q^{[m_{o+1}]}$ would execute the if-statement for Pi, and write v
in myLendq[p][0..1][i], indicating that the private value v of Pp is being used by Pi, and this
will not be changed until $S_i^{[l]}$ is complete; in fact, until the next operation execution $O_i^{[l+1]}$
of Pi starts. Hence $L_p^{[k_{j'+1}]}$ and successive labeling operation executions of Pp that may occur
before $S_i^{[l]}$ is complete are able to trace v in lend[q,p], and hence will not reissue v. □

Corollary 1 It is clear from the proof of Lemma 7 that if a Scan $S_i^{[l]}$ uses a private value v
of Pp which is written in labeling operation execution $L_p^{[k_j]}$, then $TW_p^{[k_j]}(O[p]) \dashrightarrow TR_{i,p}^{[l]}(O[p])$
for direct reading and $TW_p^{[k_j]}(O[p]) \dashrightarrow TR_{q,p}^{[m_o]}(O[p]) \to TW_q^{[m_o]}(O[q]) \dashrightarrow TR_{i,q}^{[l]}(O[q])$ for
indirect reading of v via process Pq. For the latter relation, by the axioms of Anger^,
$TW_p^{[k_j]}(O[p]) \dashrightarrow TR_{i,q}^{[l]}(O[q])$. □

The following lemma shows that Scans can determine the correct temporal order of the
private values of all processes.

Lemma 8 Let $S_i^{[l]}$ be a Scan that uses private values v and v' of a process Pp. Then, $S_i^{[l]}$ can
determine the correct temporal order between the values v and v'.

Proof: Assume Scan $S_i^{[l]}$ uses the two different private values v and v' of process Pp that
has written them in traceable Writes $TW_p^{[k_j]}$ and $TW_p^{[k_{j'}]}$, respectively, where j < j', and
hence, $v \prec_p v'$ (as defined in Section |^). By Lemma 7, Pp does not recycle v and v' until $S_i^{[l]}$
is complete. To guarantee the correctness of the timestamp system, we need to make sure
that $S_i^{[l]}$ can correctly determine the order $v \prec_p v'$ in case these values are used in ordering
some of the scanned labels. From the LABELING and SCAN routines and Corollary 1, we
have $W_p^{[k_{j'}]}(order[p,i]) \to TW_p^{[k_{j'}]}(O[p]) \dashrightarrow TR_{i,q}^{[l]}(O[q]) \to R_i^{[l]}(order[p,i])$, where q is as
defined in Corollary 1. That is, $W_p^{[k_{j'}]}(order[p,i]) \to R_i^{[l]}(order[p,i])$. Now, we need to make
sure that $L_p^{[k_{j'}]}$ can correctly determine that the private value v is being used by the process
Pi, before writing order[p,i]. Of course, it would assume v' could be used by Pi too. As it
knows $v \prec_p v'$, to inform Pi of this ordering, it writes v at a lower indexed entry in order[p,i]
than v'. The successive labeling operation executions do not change this ordering. Thus, Pi
can determine the order of v and v' correctly after reading order[p,i], by the regularity of
order variables.
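
The convention that an older private value sits at a lower index of order[p,i] can be sketched as follows. This is an illustrative assumption about representation only: the order variable is modeled as a plain Python list of private values, and the function name `older` is hypothetical.

```python
def older(order_entry, v, v_prime):
    """Return True if v is listed at a lower index than v_prime in order[p][i].

    Raises ValueError if either value is not present in the entry.
    """
    return order_entry.index(v) < order_entry.index(v_prime)
```

A Scan of Pi that has read such an entry could then rank any two private values of Pp that it uses by comparing their positions.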

Now we answer the question of how $L_p^{[k_{j'}]}$ finds that v might be used by Pi. Note that Pp
does not know precisely which of its private values Pi is going to use. So, it guesses a subset
of its private values, which contains the values actually being used by Pi. There are two cases
to be considered.

Case 1. Pi obtains v directly from Pp. Either $TW_p^{[k_j]}$ or $TW_p^{[k_{j+1}]}$ will reserve v for Pi by
storing v in lend[p,p][0..1][i], and hence the use of v by Pi is traceable.

Case 2. Pi obtains v indirectly through another process Pq, for some q. From the claim in the
proof of Lemma 7, we know that Pq has obtained v directly from Pp. Let the corresponding
labeling operation execution be $L_q^{[m_o]}$. Either $TW_p^{[k_j]}$ or $TW_p^{[k_{j+1}]}$ will set lend[p,p][0..1][q]
to v, and Pp assumes v could be used by any process Pi through O[q] (one level of indirect
propagation of a private value). At the end of $L_q^{[m_o]}$, in $TW_q^{[m_o]}$, Pq informs Pp that v is in
O[q] by setting lend[q,p][1][q] to v (Statements 5-6), and this value could be used by any
process Pi. Alternatively, if Pq detects that v is being used by Pi, it reports this use
through lend[q,p][0..1][i] (Statements 4.2.2 and 6).

Hence, if $L_p^{[k_{j'}]}$ finds v in lend[p,p][0..1][i] or lend[p,p][0..1][q] or lend[q,p][1][q] or lend[q,p][0..1][i],
for some q, it will assume that v is being used by Pi (Statements 1 and 4.1 in the LABELING
procedure).

The assertion follows. □

Claim 1 Each order variable is of size at most 5n.

Proof: As discussed in the proof of Lemma 8, Pp needs to reserve its private values referred
to in lend[q,p][0..1][i], lend[q,p][1][q] and lend[p,p][0..1][q] for all q, that is, at most 5n values
for process Pi. The claim follows. □

Corollary 2 The set of private values is bounded. In fact, by Statements 1-2 in the LABELING
procedure, the size of the set is less than 2n^ . □

By the discussion at the end of 3rd paragraph, Section the correctness of the proposed
construction is immediate. However, for the sake of completeness, we give the proof in
Theorem 1. Before that, a technical lemma follows.

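Lemma 9 Let $TR_{i,p}^{[l]}$ and $TR_{i',p}^{[l']}$ be two traceable Reads such that $TR_{i,p}^{[l]} \to TR_{i',p}^{[l']}$, and let
$\pi(TR_{i,p}^{[l]})$ be $TW_p^{[k_j]}$. Then,

(a) $W_p^{[k_j]}(c[p]) \Rightarrow R_{i'}^{[l']}(c[p])$,

(b) $\pi(TR_{i',p}^{[l']})$ is $TW_p^{[k_{j'}]}$, where $j' \ge j$, $k_{j'} \ge k_j$.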

Proof: We have the following two cases.

Case 1: $TR_{i,p}^{[l]}$ finds $r_{i,p} \ne w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$.

Lemma 5(b) implies that $\pi(R_i^{[l]}(c[p]))$ is $W_p^{[k_j]}(c[p])$. Then, we have $TW_p^{[k_{j-1}]} \to W_p^{[k_j]}(c[p]) \Rightarrow
R_i^{[l]}(c[p]) \to R_i^{[l,2]}(w_{p,i}) \to R_{i'}^{[l']}(c[p])$.

Case 2: $TR_{i,p}^{[l]}$ finds $r_{i,p} = w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$.

By Lemma 6 we have $TW_p^{[k_{j-1}]} \to W_p^{[k_j]}(c[p]) \to W_p^{[k_j]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i}) \to R_{i'}^{[l']}(c[p])$.

For both the cases we have $W_p^{[k_j]}(c[p]) \Rightarrow R_{i'}^{[l']}(c[p])$; part (a) follows. If $TR_{i',p}^{[l']}$ finds
$r_{i',p} \ne w_{p,i'}$ on $R_{i'}^{[l',2]}(w_{p,i'})$, then part (b) follows by Lemma 5. Assume $TR_{i',p}^{[l']}$ finds $r_{i',p} = w_{p,i'}$
on $R_{i'}^{[l',2]}(w_{p,i'})$. From the above two cases, we have $TW_p^{[k_{j-1}]} \to R_{i'}^{[l',1]}(w_{p,i'})$. Then part (b)
follows by Lemmas ^ and ^. □






Theorem 1 The construction of Figure ^ is a correct implementation of wait-free bounded
concurrent timestamp systems.

Proof: The wait-freedom property is immediate from the structure of the four routines in
that figure. The boundedness follows from Corollary 2. We now show that the construction
satisfies all the four properties P1-P4 described in Section ||.

Ordering: Consider two labeling operation executions $L_p^{[k]}$ and $L_q^{[k']}$ with labels $l_p^{[k]}$ and $l_q^{[k']}$,
respectively. Let m be the least significant index such that $l_p^{[k]}[m] \ne l_q^{[k']}[m]$. Assume these
private values $l_p^{[k]}[m]$ and $l_q^{[k']}[m]$ are written by Pm at labeling operation executions $L_m^{[s_o]}$ and
$L_m^{[s_{o'}]}$, respectively. We define $L_p^{[k]} \Rightarrow L_q^{[k']}$ iff $L_m^{[s_o]} \to L_m^{[s_{o'}]}$.

• Precedence: Without loss of generality we assume $L_p^{[k]} \to L_q^{[k']}$. By Lemmas 5 and 6,
we have $\pi(TR_{p,m}^{[k]})$ is $TW_m^{[s_o]}$ and $\pi(TR_{q,m}^{[k']})$ is $TW_m^{[s_{o'}]}$. Then, from $TR_{p,m}^{[k]} \to TR_{q,m}^{[k']}$
and Lemma 9(b), we have $s_{o'} \ge s_o$. As $l_p^{[k]}[m] \ne l_q^{[k']}[m]$, we have $s_{o'} \ne s_o$, and hence,
$s_{o'} > s_o$. That is, $L_m^{[s_o]} \to L_m^{[s_{o'}]}$. The precedence property follows.

• Consistency: For any two labels $l_p^{[k]}$ and $l_q^{[k']}$ (returned by a Scan) such that m is the least
significant index for which $l_p^{[k]}[m] \ne l_q^{[k']}[m]$, we define $l_p^{[k]} \prec l_q^{[k']}$ iff $l_p^{[k]}[m] \prec_m l_q^{[k']}[m]$
iff $L_m^{[s_o]} \to L_m^{[s_{o'}]}$. The consistency property follows by Lemma 8 and the definition of
$\Rightarrow$ given above.

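Regularity: Consider a Scan $S_i^{[l]}$ that returns a label $l_p^{[m_o]}$ that is written by a labeling
operation execution $L_p^{[m_o]}$, that is, $\pi(TR_{i,p}^{[l]})$ is $TW_p^{[m_o]}$. By Lemmas 5 and 6 we can say
$TW_p^{[m_o]} \dashrightarrow TR_{i,p}^{[l]}$, and hence, $L_p^{[m_o]} \dashrightarrow S_i^{[l]}$. The second part of the regularity property
follows from: (i) if $TR_{i,p}^{[l]}$ finds $r_{i,p} \ne w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$, then, by Lemma 5, $\pi(TR_{i,p}^{[l]})$ is
$TW_p^{[m_o]}$, where $\pi(R_i^{[l]}(c[p]))$ is $W_p^{[m_o]}(c[p])$, and so, $TW_p^{[m_{o+1}]} \not\to TR_{i,p}^{[l]}$, and hence $L_p^{[m_{o+1}]} \not\to
S_i^{[l]}$; (ii) if $TR_{i,p}^{[l]}$ finds $r_{i,p} = w_{p,i}$ on $R_i^{[l,2]}(w_{p,i})$, then, by Lemma 6, $\pi(TR_{i,p}^{[l]})$ is $TW_p^{[m_o]}$,
where $R_i^{[l,1]}(w_{p,i}) \Rightarrow W_p^{[m_o]}(w_{p,i}) \Rightarrow R_i^{[l,2]}(w_{p,i})$, and so, $TW_p^{[m_{o+1}]} \not\to TR_{i,p}^{[l]}$, and hence
$L_p^{[m_{o+1}]} \not\to S_i^{[l]}$.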

Monotonicity: Consider two Scans $S_i^{[l]} \to S_{i'}^{[l']}$. Let $S_i^{[l]}$ return label $l_p^{[m_o]}$ from a process Pp.
By Lemmas 5 and 6, we have $\pi(TR_{i,p}^{[l]})$ is $TW_p^{[m_o]}$. From $S_i^{[l]} \to S_{i'}^{[l']}$, we have $TR_{i,p}^{[l]} \to
TR_{i',p}^{[l']}$. The monotonicity property follows by Lemma 9.

Extended regularity: Consider a Scan $S_i^{[l]}$ that returns a label $l_p^{[m_o]}$ that is written by a labeling
operation execution $L_p^{[m_o]}$, that is, $\pi(TR_{i,p}^{[l]})$ is $TW_p^{[m_o]}$. For each labeling operation execution
$L_q^{[k']}$, if $S_i^{[l]} \to L_q^{[k']}$, then $TR_{i,p}^{[l]} \to TR_{q,p}^{[k']}$. Then, by Lemma 9(a), we have $W_p^{[m_o]}(c[p]) \Rightarrow
R_q^{[k']}(c[p])$ and hence, $\pi(TR_{q,p}^{[k']})$ is $TW_p^{[m_o]}$ or a successor, by Lemma 9(b). Also by Lemmas 5
and 6 and the LABELING procedure, we have $TR_{p,s}^{[m_o]} \to TW_p^{[m_o]} \dashrightarrow TR_{i,p}^{[l]} \to TR_{q,s}^{[k']}$ for
all $s \ne p$, that is, $TR_{p,s}^{[m_o]} \to TR_{q,s}^{[k']}$. Hence, $L_q^{[k']}$ reads more recent (at least equal) private
values of all processes than $L_p^{[m_o]}$. Also, we have $l_p^{[m_o]}[q] \prec_q l_q^{[k']}[q]$. Hence $L_p^{[m_o]} \Rightarrow L_q^{[k']}$. The
extended regularity property follows. □



5 Concluding Remarks 

This paper combines the preliminary ||4^, The former paper is the first to characterize
multiwriter shared variables, and provides a bounded construction of a multiwriter multireader
multivalued atomic variable from 1-writer variables. However, it was later found that the
proposed construction does not satisfy some properties of atomic shared variables [43]. The
technical report |]1^] corrected it and extended it to a construction of a concurrent timestamp
system using an idea from jl^. The final result is very close to the incorrect construction
of [42]. It uses shared variables of size O(n log n) bits (order and lend variables), where n is
the number of processes. Scan and labeling operation executions require O(n) steps. The
construction uses less shared space than that of [10] at the fundamental level, and is orders
of magnitude more efficient in terms of scanning bits at the fundamental level.



5.1 Comparison with Related Work 



In [10], three routines are defined, namely traceable-read, traceable-write and garbage
collection. When the traceable-read function is executed to read a label, the executing process
explicitly informs the other processes which of their private values it is going to use. The
traceable-write procedure is executed to write a new label. To determine which of its private
values are currently in use, a process executes the garbage collection routine. This routine
helps processes to safely recycle their respective private values. It is the most intricate
of the three routines.

In our construction, we have used a separate implementation technique for a weaker form
of the traceable-read and the traceable-write routines. We do not need a garbage collection
routine. When a process executes the traceable-read function, it does not explicitly inform
the other processes which of their private values it is going to use. On the other hand,
the executors of the traceable-write procedure correctly find which private values of which
processes are in use.

Every process needs a separate pool of private values, whose size is fewer than 2n^. In their
construction, the pool size is 22n^. All the ordering shared variables used in our construction
are 1-writer 1-reader regular ones, whereas they are 1-writer n-reader atomic ones in their
construction. In our construction, a Scan reads at most n-1 1-writer 1-reader regular ordering
shared variables, whereas in their construction it reads 2n-2 1-writer n-reader atomic ones. In
our construction all but one bit are nonatomic 1-writer 1-reader variables. Table 1 briefly
presents some comparison results.



Table 1: Comparison Results.

Construction   Shared variable size   Shared space (bits)   Labeling      Scan
i              O(n)                   O(n^)                 O(n)          O(n^ log n)
g              O(n')                  O(n^4)                O(n log n)    O(n log n)
1              O(n')                  O(n^4)                O(n)          O(n)
1              O(n log n)             O(n^ log n)           O(n)          O(n)
1              O(n)                   O(n^3)                O(n)          O(n)
This paper     O(n log n)             O(n'^ log n)          O(n)          O(n)



Of all the proposed constructions of bounded concurrent timestamp systems we are aware of,
the construction in this paper is the 'simplest'. The correctness proof, though involved, is
easier to follow. It is used as a basis in the reference text Q to describe bounded concurrent
timestamp systems.



Although we have used a notion of vector clocks for our construction, as in [42], we may
not really need the full power of the vector clock concept developed later by Mattern ||3^. In
CTSs, we are not interested in determining causal 'independence' of various labeling operation
executions. The ordering property of CTSs implies that the causal orders among labeling
operation executions matter most. We need to have a total order on all labeling operation
executions, and the total order must extend their original causal relation. This is akin to the
logical time of Lamport [27]. We suspect that there might be a way to eliminate the vector
clock altogether, by an efficient way of recycling global values, instead of using n sets of
private values.

The construction presented here should not be considered as an alternative implementation
of the traceable use abstraction, for it restricts the value propagation to indirection level one.
It is not clear to the authors how this strategy could be extended to a general implementation
of the abstraction.



5.2 A Brief Early History 

The development of bounded wait-free shared variables and timestamp systems has been quite 
problematic and error-prone. It may be useful at this point to present a brief early history 
of the area: who did what, when, and where, and which solutions are known to be incorrect. 



In a series of papers [25, Eq, 27, ES[| starting in 1974, Lamport explored various notions of
concurrent reading and writing of shared variables, culminating in the seminal 1986 paper
[29]. It formulates the notion of wait-free implementation of an atomic shared variable,
written by a single writer and read by (another) single reader, from safe 1-writer 1-reader
2-valued shared variables, being mathematical versions of physical flip-flops. Predating the
latter paper, in 1983 Peterson [34] published an ingenious wait-free construction of an
atomic 1-writer n-reader m-valued shared variable from n + 2 safe 1-writer n-reader m-valued
registers, 2n 1-writer 1-reader 2-valued atomic shared variables, and 2 1-writer n-reader
2-valued atomic shared variables. He also presented a proper notion of the wait-freedom property.
Lamport [^] gave an example that appeared to contradict a possible interpretation of the
informal statement of a theorem in [34], which, as Peterson apparently retorted to Lamport,
was not intended. In his paper, Peterson did not tell how to construct the n-reader boolean
atomic variables from flip-flops, while Lamport mentioned the open problem of doing so, and,
incidentally, used a version of Peterson's construction to bridge the algorithmically demanding
step from atomic shared bits to atomic shared multivalues. Based on this work, N. Lynch,
motivated by concurrency control of multi-user databases, posed around 1985 the question of
how to construct wait-free multiwriter atomic variables from 1-writer multireader atomic
variables. Her student Bloom |||] found in 1985 an elegant 2-writer construction, which, however,
has resisted generalizations to multiwriter. Vitanyi and Awerbuch [42| were the first to define 
and explore the complicated notion of wait-free constructions of general multiwriter atomic 
variables. They presented a proof method, an unbounded solution from 1-writer 1-reader 
atomic variables, and a bounded solution from 1-writer n-reader atomic variables. The un- 
bounded solution was made bounded in ||3l[|. It is optimal for the implementation of n- writer 



n-reader atomic variables from 1-writer 1-reader ones. "Projections" of the construction also 
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give specialized constructions for the implementation of 1-writer n-reader atomic variables 
from 1-writer 1-reader ones, and for the implementation of n- writer n-reader atomic variables 
from 1-writer n-reader ones. As noted in |^^, the first "projection" is optimal, while the 
last "projection" may not be optimal since it uses 0(n) control bits per writer while only a 
lower bound of Q,{logn) was established. Taking up this challenge, the construction in |23] 
apparently achieves this lower bound. The earlier bounded solution in [^2[ (corresponding 
in fact to the problem correctly solved by the last "projection" above) turned out not to be 
atomic, but only achieved regularity |^3[. Nonetheless, |42| introduced important notions and 
technique in the area, like (bounded) vector clocks. These were inspired by the celebrated 
"Bakery" algorithm of Lamport [^5|, which can be viewed as a global bounded "clock" deter- 
mining the order among queued processes much like the ticket dispenser in a bakery serves 
to determine the order of servicing waiting customers. The multiwriter situation has stronger 
requirements than apparently can be satisfied by a global ticket dispenser. The solution in 
[42] was the construction of a bounded "vector clock": a private ticket dispenser for each
process, the storing and updating of a vector of latest tickets held by all processes, together 
with a semantics to determine the order between vectors. Moreover, a complex mechanism — 
primitive traceable read/write — is presented to keep track of which tickets of what processes 
could still be present in the system, with the objective of bounding the private ticket pool
of each process by recycling obsolete tickets. Following the appearance of [42], Peterson, who
had been working on the multiwriter problem for a decade, together with Burns, revamped
the construction, retaining the vector clocks but replacing the primitive traceable read/write
elements by repeated scanning as in [34]. The result [35] was nonetheless found to be erroneous
in the technical report [36]. This makes the multiwriter problem perhaps the only
one for which two consecutive wrong solutions were published in the highly selective FOCS 
conferences. Neither the re-correction in [36] nor the claimed re-correction by the authors
of [35] has appeared in print. The present paper constitutes a correction of the original [42]
by the extension of [18]: by implementing the stronger concurrent timestamp system it also
solves the atomic multiwriter problem. Apart from the already mentioned [31], the only other
multiwriter multireader atomic shared variable construction that appeared in journal version 
seems to be that of Abraham [1]. Also in 1987 there appeared at least five purported solutions for
the implementation of 1-writer n-reader atomic shared variables from 1-writer 1-reader ones:
[24, ^, 37] and the conference version of [21], of which ]|6| was shown to be incorrect in [15]
and only [37] appeared in journal version. The only other 1-writer n-reader atomic shared
variable construction that appeared in journal version is that of Haldar and Vidyasankar [16]. A. Israeli
and M. Li were attracted to the area by the work in and, in an important paper
[21], they raised and solved the question of the more general and universally useful notion
of a bounded timestamp system to track the order of events in a concurrent system. Their sequential
timestamp system was published in journal version, but the preliminary concurrent
timestamp system in the conference proceedings, of which a more detailed version has been 
circulated in manuscript form, has not been published in final form. 

The difficulty of wait-free atomic multireader-, multiwriter-, and timestamp system con- 
structions, and the many errors in purported and published solutions, have made it hard to 
publish results in print. Of the major pioneering papers, the first correct multiwriter construction
of 1987 [31] was rejected at five consecutive conferences until it was published in
ICALP, 1989. The final journal version was handled by three consecutive editors, scrutinized
by three consecutive sets of referees, and took from 1989 until publication in 1996. The
pioneering timestamp paper, [21], was submitted in 1987/88 to this journal, rejected after a couple
of years since a stronger result Q had appeared in conference version, then submitted to
another journal, and finally appeared in 1993, but only the part containing the simpler sequential
timestamp construction. The first generally accepted concurrent timestamp construction
||9| appeared in conference version in 1989, but its journal version appeared only in 1997. As
stated before, the concurrent timestamp construction in the present paper is based on the
1986 paper [42] supplemented by the 1993 technical report [18]. For further remarks see [31]
in this journal and the Introduction to the present paper.
Declarations

Constants:
    n = number of processes;

Type:
    label-type: array [1..n] of natural number; {represents vector clock}
    boolean: 0..1;

Shared variables and their initial values:
    w : array [1..n, 1..n] of boolean atomic; {all initially 0}
        {Pp writes w[p, 1..n] and Pi reads w[1..n, i]}
    r : array [1..n, 1..n] of boolean atomic; {all initially 0}
        {Pp writes r[p, 1..n] and Pi reads r[1..n, i]}
    c : array [1..n] of boolean atomic; {initially 0}
        {Pp writes c[p], and the others read}
    label : array [1..n, 0..1] of label-type safe; {all initially 0, except label[p,0][p] = 1 for all p}
        {Pp writes label[p, 0..1] and the others read}
    copylabel : array [1..n, 1..n] of label-type safe;
        {Pp writes copylabel[p, 1..n] and Pi reads copylabel[1..n, i]}
    lend : array [1..n, 1..n] of regular array [0..1] of label-type; {all initially 0}
        {Pp writes lend[p, 1..n] and Pi reads lend[1..n, i]}
    order : array [1..n, 1..n] of regular array [1..5n] of natural number;
        {initially order[1..n, 1..n][1] = and order[1..n, 1..n][2]
        {Pp writes order[p, 1..n] and Pi reads order[1..n, i]}

Private variables for process Pp, p = 1, 2, ..., n:
    dp : boolean; {initially 0}
    myLendp : array [1..n] of array [0..1] of label-type; {all initially 0}
    old-labelp : label-type; {all initially 0, except old-labelp[p] = 1}
    ≺p : total order relation; {initially {(0,1)}}

Figure 1: Shared variables.
Procedure TRACEABLE-WRITE(p: 1..n; new-label: label-type); {Pp writes new-label in 0[p]}
var
    i, j: 1..n; {loop index}
    lr: boolean;
begin
1.  dp := ¬dp;
2.  write new-label in label[p, dp];
3.  write dp in c[p];
4.  for i := 1 to n do
    begin {could be done in parallel}
4.1     read lr from r[i, p];
4.2     if lr ≠ w[p, i] then
4.2.1       write new-label in copylabel[p, i];
4.2.2       for j := 1 to n do myLendp[j][0..1][i] := (old-labelp[j], new-label[j]);
4.2.3       write lr in w[p, i]; {w[p, i] = r[i, p]}
        endif;
    endfor;
5.  for j := 1 to n do myLendp[j][1][p] := new-label[j];
6.  for j := 1 to n do write myLendp[j] in lend[p, j]; {could be done in parallel}
7.  old-labelp := new-label;
end; {of procedure}

Function TRACEABLE-READ(p: 1..n, i: 1..n): label-type; {Pp reads a label from
var
    lw: boolean;
    lc: boolean;
    savelabel: label-type;
begin
1.  read lw from w[i, p];
2.  write ¬lw in r[p, i]; {r[p, i] ≠ w[i, p]}
3.  read lc from c[i];
4.  read savelabel from label[i, lc];
5.  read lw from w[i, p];
6.  if (r[p, i] ≠ lw) then return(savelabel)
7.  else {r[p, i] = w[i, p]}
        read and return(copylabel[i, p])
    endif;
end; {of function}

Figure 1: Construction for process Pp. (Cont'd.) 
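
The handshake between TRACEABLE-WRITE and TRACEABLE-READ in Figure 1 can be illustrated by a small sequential simulation. The Python sketch below is not the construction itself: it uses 0-based process indices, executes every step atomically in a single thread, and omits the lend/myLend bookkeeping (steps 4.2.2, 5 and 6 of TRACEABLE-WRITE). The class name `TraceableRegisters`, the split of a read into `read_begin` and `read_end`, and the all-zero initial value of `copylabel` are assumptions made only for illustration.

```python
class TraceableRegisters:
    """Sequential toy model of the w, r, c, label and copylabel variables."""

    def __init__(self, n):
        self.n = n
        self.w = [[0] * n for _ in range(n)]   # w[p][i]: written by p, read by i
        self.r = [[0] * n for _ in range(n)]   # r[p][i]: written by p, read by i
        self.c = [0] * n                       # c[p]: which buffer of label[p] is current
        self.d = [0] * n                       # private toggle bit d_p of each process
        # label[p][0..1]: all zero except label[p][0][p] = 1
        self.label = [[[0] * n for _ in range(2)] for _ in range(n)]
        for p in range(n):
            self.label[p][0][p] = 1
        # Assumption: copylabel starts as all-zero vectors (no initial value is given).
        self.copylabel = [[[0] * n for _ in range(n)] for _ in range(n)]

    def traceable_write(self, p, new_label):
        self.d[p] = 1 - self.d[p]                     # step 1: switch buffer
        self.label[p][self.d[p]] = list(new_label)    # step 2
        self.c[p] = self.d[p]                         # step 3
        for i in range(self.n):                       # step 4
            lr = self.r[i][p]
            if lr != self.w[p][i]:                    # reader i has flipped its bit
                self.copylabel[p][i] = list(new_label)
                self.w[p][i] = lr                     # now w[p][i] == r[i][p]

    def read_begin(self, p, i):
        """Steps 1-4 of TRACEABLE-READ; returns the label saved so far."""
        lw = self.w[i][p]
        self.r[p][i] = 1 - lw                         # make r[p][i] != w[i][p]
        lc = self.c[i]
        return list(self.label[i][lc])

    def read_end(self, p, i, savelabel):
        """Steps 5-7 of TRACEABLE-READ."""
        lw = self.w[i][p]
        if self.r[p][i] != lw:                        # bits still differ: keep saved label
            return savelabel
        return list(self.copylabel[i][p])             # bits equal: take the writer's copy

    def traceable_read(self, p, i):
        return self.read_end(p, i, self.read_begin(p, i))


regs = TraceableRegisters(2)
undisturbed = regs.traceable_read(1, 0)      # no write overlaps this read

saved = regs.read_begin(1, 0)                # process 1 starts reading from process 0
regs.traceable_write(0, [2, 0])              # process 0 writes in between
overlapped = regs.read_end(1, 0, saved)      # the read finishes after the write
print(undisturbed, overlapped)
```

In this run the undisturbed read returns the saved label, while the overlapped read finds `r[p][i]` equal to `w[i][p]` and therefore returns the copy left in `copylabel`.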
Procedure LABELING(p: 1..n);
var
    j, k: 1..n;
    temp: array [1..n] of array [0..1] of label-type;
    lab: array [1..n] of label-type;
    new-label: label-type;
    private-value: natural number;
begin
1.  for j := 1 to n do {could be done in parallel}
        read temp[j] from lend[j, p]; {we do not need temp[j][0][j]}
2.  select a new private-value not in temp[1..n] and different from the current private value; {use the axiom of choice here}
3.  put the new private-value in ≺p as the largest element;
4.  for j := 1 to n do {could be done in parallel}
4.1     order the elements of (temp[1..n][0..1][j],
            temp[k][1][k] and
            temp[p][0..1][k] for all k,
            and the new private-value) consistent with ≺p
4.2     and write them in order[p, j];
5.  for j := 1 to n, j ≠ p, do lab[j] := TRACEABLE-READ(p, j); {could be done in parallel}
6.  new-label := (lab[1][1], lab[2][2], ..., lab[p][p] := private-value, ..., lab[n][n]);
7.  TRACEABLE-WRITE(p, new-label);
end;

Function SCAN(p: 1..n):(I,^);
var
    i, j, k: 1..n;
    lab: array [1..n] of label-type;
begin
1.  for j := 1 to n do lab[j] := TRACEABLE-READ(p, j); {could be done in parallel}
2.  for i := 1 to n do
2.1     for j := 1 to n do
2.1.1       let k be the least significant index in which lab[i] differs from lab[j];
2.1.2       if order[k, p] (which is a subset of ≺k) is not read yet, then read it;
2.1.3       determine the order between lab[i] and lab[j] using ≺k;
end;

Figure 1: Construction for process Pp. (Cont'd.) 
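
Steps 2.1.1 to 2.1.3 of SCAN compare two labels by looking at a single coordinate. A minimal Python sketch of that comparison follows, under explicit assumptions: indices are 0-based, the "least significant index" is taken to be the lowest-numbered coordinate in which the labels differ (the figure does not say which end is least significant), and each order ≺k is represented as a list of private values from smallest to largest. The names `differing_index` and `compare_labels` are hypothetical.

```python
def differing_index(a, b):
    """Lowest index at which labels a and b differ, or None if they are equal.

    Assumption: 'least significant' is read as 'lowest-numbered'.
    """
    for k, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return k
    return None


def compare_labels(a, b, orders):
    """Return -1, 0 or 1 as a precedes, equals or follows b.

    orders[k] lists the private values of process k from smallest to largest,
    standing in for the relation read from order[k, p] in SCAN.
    """
    k = differing_index(a, b)
    if k is None:
        return 0
    rank = {value: position for position, value in enumerate(orders[k])}
    return -1 if rank[a[k]] < rank[b[k]] else 1


# The order for process 0 is 3, 0, 1: numeric size of a value does not matter.
orders = [[3, 0, 1], [0, 1, 2]]
print(compare_labels((0, 1), (3, 1), orders))  # labels differ at index 0
print(compare_labels((0, 1), (0, 2), orders))  # labels differ at index 1
```

Representing each ≺k as an explicit list keeps the comparison independent of the numeric size of the private values, which matters once values are recycled.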






